Defining the Eukalyptus forest - the Koala treebank of Swedish
Abstract
This paper details the design of the lexical and syntactic layers of a new annotated corpus of Swedish contemporary texts. In order to make the corpus adaptable into a variety of representations, the annotation is of a hybrid type with head-marked constituents and function-labeled edges, and with a rich annotation of non-local dependencies. The source material has been taken from public sources, to allow the resulting corpus to be made freely available.
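As a rough illustration of why head-marked constituents with function-labeled edges can be adapted into other representations, the sketch below derives dependency triples from a small constituent tree. It is not the Eukalyptus format or tooling: the class names, the labels (`S`, `NP`, `SB`, `OB`, `HD`), and the example sentence are all hypothetical, and non-local dependencies are left out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Token:
    index: int  # 1-based position in the sentence
    form: str


@dataclass
class Edge:
    function: str                  # hypothetical function label on the edge
    child: Union["Node", Token]
    is_head: bool = False          # head marking on the constituent's child


@dataclass
class Node:
    category: str                  # hypothetical phrase category
    edges: List[Edge] = field(default_factory=list)


def lexical_head(item: Union[Node, Token]) -> Token:
    """Follow head-marked edges down to a token."""
    if isinstance(item, Token):
        return item
    for edge in item.edges:
        if edge.is_head:
            return lexical_head(edge.child)
    raise ValueError(f"constituent {item.category} has no head-marked child")


def to_dependencies(node: Node) -> List[Tuple[int, int, str]]:
    """Return (head index, dependent index, function) triples."""
    deps = []
    head = lexical_head(node)
    for edge in node.edges:
        if not edge.is_head:
            deps.append((head.index, lexical_head(edge.child).index, edge.function))
        if isinstance(edge.child, Node):
            deps.extend(to_dependencies(edge.child))
    return deps


# Hypothetical example: "Koalan äter löv" ("The koala eats leaves").
koalan, ater, lov = Token(1, "Koalan"), Token(2, "äter"), Token(3, "löv")
tree = Node("S", [
    Edge("SB", Node("NP", [Edge("HD", koalan, is_head=True)])),
    Edge("HD", ater, is_head=True),
    Edge("OB", Node("NP", [Edge("HD", lov, is_head=True)])),
])

print(to_dependencies(tree))
```

Because each constituent records which child is its head, the same tree can be read either as a phrase structure or, as here, as labeled head-dependent pairs.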
Similar resources
Logistics and pretreatment of forest biomass
In large regions of the world, biomass is a very important source of energy. The global bioenergy market based on forest biomass is growing rapidly. About 92 % of the bioenergy in Sweden comes from forests. Biomass from forests is not homogenous. The locations, transport distances, and transport methods, can differ very much and the industries that need biomass as input prefer raw materials wit...
A Multi-domain Corpus of Swedish Word Sense Annotation
We describe the word sense annotation layer in Eukalyptus, a freely available five-domain corpus of contemporary Swedish with several annotation layers. The annotation uses the SALDO lexicon to define the sense inventory, and allows word sense annotation of compound segments and multiword units. We give an overview of the new annotation tool developed for this project, and finally present an an...
A game theory approach to the Iranian forest industry raw material market
Dynamic game theory is applied to analyze the timber market in northern Iran as a duopsony. The Nash equilibrium and the dynamic properties of the system based on marginal adjustments are determined. When timber is sold, the different mills use mixed strategies to give sealed bids. It is found that the decision probability combination of the different mills follow a special form of attractor an...
What kinds of trees grow in Swedish soil? A Comparison of Four Annotation Schemes for Swedish
One of the issues brought up in this workshop concerns the relationship between the syntactic properties of a given language and the choice of linguistic theory for annotation purposes. Our Swedish treebank consortium, consisting of researchers from Växjö University, KTH and Stockholm University, is currently facing a specific instance of this issue in trying to define an annotation standard fo...
Converting an English-Swedish Parallel Treebank to Universal Dependencies
The paper reports experiences of automatically converting the dependency analysis of the LinES English-Swedish parallel treebank to universal dependencies (UD). The most tangible result is a version of the treebank that actually employs the relations and parts-of-speech categories required by UD, and no other. It is also more complete in that punctuation marks have received dependencies, which ...